DISTRIBUTION OF ERROR IN LEAST-SQUARES SOLUTION OF AN OVERDETERMINED
SYSTEM OF LINEAR SIMULTANEOUS EQUATIONS


by C. David Miller 

Aerospace Safety Research and Data Institute 
Lewis Research Center 

SUMMARY 

Probability density functions are derived for errors in the evaluation of unknowns by
the least-squares method for a system of n nonhomogeneous linear equations in p
unknowns, for n greater than p. The coefficients of the unknowns are assumed to be
correct and computational precision is assumed. The treatment deals only with errors
due to inaccurate constant terms or due to the existence of unknowns within the physical
system that generates the system of equations, which affect the values of the constant
terms, and which are omitted from the equations. Columns of the coefficient matrix
and the column of constant terms are viewed as vectors in an n-dimensional space. The
method involves definition of an error vector and an assumption that the error vector
will be randomly oriented, with uniform distribution throughout the n-dimensional space.

A probability density function that gives due regard to the effect of any known or 
assumed biasing effects associated with the source of the system of equations is shown 
to be insensitive to those biasing effects. This fact justifies a substantial degree of 
utility for an approximate density function that is derived without regard to any biasing 
effect associated with the source. 

Possible applications are mentioned for use of the density functions to enhance the 
protection provided by a warning system in which critical values of unknowns represent 
the hazards for which warning must be provided. 


INTRODUCTION 

A least-squares method (ref. 1) is generally used for finding a compromise solution
for values of p unknowns from a system of n nonhomogeneous linear equations in
which n is greater than p. An analysis of the least-squares method that may have
novel aspects will be described here. The analysis involves, in effect, a method of
inversion of a matrix by use of vectors in n-dimensional space. The vector method of
matrix inversion serves as an essential groundwork for the method of statistical
analysis of probable error that will be presented.

The method of estimation of probable error begins with an assumption that the
coefficients of the unknowns are dependable. It is assumed that the constant terms may be
more or less undependable either because of inaccuracies in their measurement or
because of other effects on the constant terms than the values of the unknowns and their
coefficients. The method involves division of the n-dimensional vector space into a
p-dimensional and an (n - p)-dimensional subspace. The p-dimensional space is that
spanned by the column vectors of the coefficient matrix. The (n - p)-dimensional space
is its orthogonal complement. An error vector is defined, and it is shown that the
component of the error vector within the (n - p)-dimensional subspace can be evaluated
directly. Statistical relations may then be developed between this component of the error
and the component within the p-dimensional subspace.

The analysis is performed both without and with regard to the distribution of magni- 
tude of the error to be expected because of characteristics of the source of the system 
of simultaneous equations. The two results are critically compared. 

This analysis was performed for application to a problem in infrared spectroscopy.
The problem concerned monitoring of a gas-filled space for detection of traces of
specific toxic gases by measurement of the absorption of infrared radiation of various
wavelengths. Because of the impossibility of excluding the effects of many unknown gases
that might be present in admixture with those to be monitored, a mathematical method
was desired for predicting the probable error in the estimation of concentrations of the
specific toxic gases, produced by the presence of the unknown gases. The mathematical
concepts developed here are part of the method that is under development for that
purpose.


SYMBOLS

[A]             coefficient matrix of n rows and p columns

$A_i$           vector defined by eq. (8)

$A'_i$          vector extending in same direction as $A'_{ui}$, but with magnitude such as
                to yield unit dot product with vector $A_i$

$A_s$           any specific unit vector

$A_{ui}$        unit vector directed same as vector $A_i$

$A'_{ui}$       unit vector extending in same direction as component of $A_i$ orthogonal to
                each $A_k$ vector for k ≠ i

                an intermediate vector used in Gram-Schmidt orthogonalization

$a_{ji}$        coefficient within jth equation for ith unknown

[B]             matrix defined by eq. (4)

[C]             matrix defined by eq. (6)

E               error vector defined by eq. (18)

E[ ]            expected value of any argument within brackets

$E_i$           component of error vector E lying outside subspace $S_o$ but within
                subspace $S_{o+i}$

$E_o$           component of error vector E lying within subspace $S_o$

$E_p$           component of error vector E lying within subspace $S_p$

$E_{o+i}$       component of error vector E lying within subspace $S_{o+i}$

F( )            distribution function of random variable appearing within parentheses

f( )            density function for random variable appearing within parentheses

f(x|y)          density function of any random variable x when given y

g( )            density function for random variable appearing within parentheses (used
                instead of f( ) only when desirable to avoid confusion)

g(x|y)          density function of random variable x when given y (used instead of f(x|y)
                only when desirable to avoid confusion)

j               (subscript) order number of an equation within overdetermined system

M               measured vector, defined by eq. (9)

[M]             column matrix of constant (measured) terms

$M_o$           component of vector M within subspace $S_o$

$M_p$           component of vector M within subspace $S_p$

$m_j$           (subscripted) constant (or measured) term in equation designated by subscript

n               number of simultaneous equations in overdetermined system

$O_k$           kth member of a set of (n - p) orthogonal unit vectors spanning subspace $S_o$

$P_k$           kth member of a set of p orthogonal unit vectors spanning subspace $S_p$

p               number of unknowns in overdetermined system of equations

R               any unit vector randomly oriented with uniform distribution throughout
                n-dimensional space

r               ratio of $|E_i|$ to $|E_o|$

$S_n$           total n-dimensional space

$S_o$           (n - p)-dimensional subspace orthogonal to $S_p$

$S_{o+i}$       (n - p + 1)-dimensional subspace consisting of subspace $S_o$ plus the
                direction of $A'_i$ vector

$S_p$           p-dimensional subspace spanned by $A_i$ vectors (i = 1 to p)

$U_j$           jth member of orthogonal set of n unit vectors spanning n-dimensional space

[X]             column matrix of unknowns

[$X_c$]         column matrix of unknowns to be calculated by least-squares method

                magnitude of component of vector R in direction of $U_j$

$x_{ci}$        value of ith unknown as calculated by least-squares method

$x_i$           ith unknown in overdetermined system of equations

$x_{ti}$        postulated true value of ith unknown

$\alpha_i$      angle between vectors $A'_i$ and $A_i$

$\beta$         absolute value of dot product of a specific unit vector $A_s$ and a unit
                vector R randomly oriented with uniform distribution

$\epsilon_j$    component of error vector so defined as to make jth equation exact as
                illustrated in eq. (17)

$\eta_{ai}$     algebraic error in calculation of unknown designated by subscript i by
                least-squares method, defined by eq. (19)

$\eta_{fi}$     fractional error defined by eq. (51)

$\theta_{o+i}$  angle between vectors $E_{o+i}$ and $A'_i$

                limit of integration defined by eq. (C6)

                standard deviation of

DESCRIPTION OF METHODS OF SOLUTION 

The least-squares method of solution of an overdetermined system in its usual
matrix form will first be described. Then the vector method will be described and will be
used as a foundation to describe the method of estimation of probable error.

Matrix Form of Least-Squares Method


We assume the existence of a system of equations, derived from whatever sources,
as follows:

$$\sum_{i=1}^{p} a_{ji} x_i = m_j \qquad (j = 1 \text{ to } n) \tag{1}$$

In equations (1), p is the number of unknowns for which values are to be estimated,
j identifies a particular equation within the system of n equations, the $a_{ji}$ values are
known constant coefficients, and the $m_j$ values are known constant terms.

Equations (1) rewritten in matrix form are

$$[A][X] = [M] \tag{2}$$

Here [A] is a coefficient matrix including only the $a_{ji}$ values in the same column and
row arrangement as in equations (1). [X] is a column matrix including only the $x_i$
variables. [M] is a column matrix containing the values of $m_j$. The column vector M or
the matrix [M] will be referred to hereafter as the measured vector or the measured
matrix.

In the matrix form of the least-squares method of solving equations (1), or
equation (2), the following steps are now set forth in detail in order to define certain
symbolisms and intermediate relations that will be used later.

(1) Equation (2) is converted to

$$[B][X_c] = [A]^T [M] \tag{3}$$

where

$$[B] = [A]^T [A] \tag{4}$$

[$X_c$] now represents values $x_{ci}$ calculated by the least-squares method, with the
subscript c to distinguish the calculated values from postulated true values $x_{ti}$ that
will later be introduced.

(2) Equation (3) is converted to

$$[X_c] = [C][M] \tag{5}$$

where

$$[C] = [B]^{-1} [A]^T \tag{6}$$
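The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 3-by-2 system is hypothetical (it is not the example of appendix B), the helper names are invented here, and exact rational arithmetic stands in for the "computational precision" that the report assumes.

```python
from fractions import Fraction

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def inverse(m):
    """Gauss-Jordan inverse of a small square matrix in exact rational arithmetic."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Find a row with a nonzero pivot; StopIteration here means [B] is singular.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def least_squares_matrix(A, M):
    """Return ([C], [X_c]) following eqs. (4) to (6)."""
    At = transpose(A)
    B = matmul(At, A)                                        # eq. (4)
    C = matmul(inverse(B), At)                               # eq. (6)
    Xc = [sum(c * m for c, m in zip(row, M)) for row in C]   # eq. (5)
    return C, Xc

# Hypothetical system (an assumption for illustration): n = 3 equations, p = 2 unknowns.
A = [[1, 0], [0, 1], [1, 1]]
M = [1, 2, 4]
C, Xc = least_squares_matrix(A, M)
print(Xc)   # [Fraction(4, 3), Fraction(7, 3)]
```

The rows of `C` are kept because the vector method described next reproduces them one at a time.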
Vector Form of Least-Squares Method


In the vector form of the least-squares method an orthogonal set of unit coordinate
vectors $U_j$ is assumed, spanning an n-dimensional space. Each of the n equations (1)
is interpreted as a vector equation confined to a single dimension in the direction of the
$U_j$ coordinate. That is, the jth equation (1) may be represented by

$$a_{j1} x_1 U_j + a_{j2} x_2 U_j + a_{j3} x_3 U_j + \cdots + a_{jp} x_p U_j = m_j U_j \tag{7}$$

Thus the left sides, or the right sides, of equations (1) become an orthogonal set of
vectors within the n-dimensional space.

Vectors are now defined as follows:

$$A_i = \sum_{j=1}^{n} a_{ji} U_j \qquad (i = 1 \text{ to } p) \tag{8}$$

$$M = \sum_{j=1}^{n} m_j U_j \tag{9}$$

The vectors $A_i$ (i = 1 to p) must be linearly independent to allow a least-squares
solution by the vector method. Nonsingularity of matrix [B] (eq. (4)) implies linear
independence of the vectors $A_i$.

From equations (7) and the defining equations (8) and (9), we see that

$$\sum_{i=1}^{p} x_i A_i = M \tag{10}$$

Normalization of the $A_i$ vectors to unity gives

$$A_{ui} = \frac{A_i}{|A_i|} \qquad (i = 1 \text{ to } p) \tag{11}$$

We now develop a set of unit vectors $A'_{ui}$, each of which extends in the same
direction as the component of $A_i$ that is orthogonal to each $A_k$ for k ≠ i. Each $A'_{ui}$
vector individually is the end result of a Gram-Schmidt orthogonalization (ref. 2),
though collectively they are not the set of mutually orthogonal vectors obtained from a
single Gram-Schmidt orthogonalization. A different Gram-Schmidt orthogonalization
must be performed for the determination of each $A'_{ui}$ vector, though to varying extents
parts of the orthogonalizations may be performed in common. For determination of a
particular $A'_{ui}$ vector, the Gram-Schmidt orthogonalization may be performed in any
order, with only one exception: the first p - 1 mutually orthogonal vectors obtained
must span the same (p - 1)-dimensional subspace to which are confined all the $A_k$
vectors for k ≠ i. The final (pth) member of the mutually orthogonal set must then be
the component of the $A_i$ vector that is orthogonal to the (p - 1)-dimensional subspace,
normalized to unity. That pth member of the set, normalized, will be the $A'_{ui}$ vector.

An example for the procedure of finding the vector $A'_{ui}$, with n = 4, p = 3, and
i = 1, is given in appendix A.

We now define a set of vectors

$$A'_i = \frac{A'_{ui}}{A'_{ui} \cdot A_i} \qquad (i = 1 \text{ to } p) \tag{12}$$

From equation (10),

$$A'_i \cdot M = \sum_{k=1}^{p} x_k \, A'_i \cdot A_k \tag{13}$$

But since, for k ≠ i, the vector $A'_i$ is orthogonal to the vector $A_k$, equation (13) may
be rewritten as

$$A'_i \cdot M = x_i \, A'_i \cdot A_i \tag{14}$$

From equation (12) we see that

$$A'_i \cdot A_i = 1 \tag{15}$$

by definition, so that equation (14) is equivalent to

$$x_{ci} = A'_i \cdot M \tag{16}$$
The subscript ci is used in equation (16) to indicate that the equation yields values
of $x_i$ calculated by the least-squares method. It can be shown that the $A'_i$ vectors are
always identical with the row vectors of matrix [C] in equation (5).

A sample solution of a system of equations by both matrix and vector methods
appears in appendix B. Included (eqs. (B3) and (B8)) is an example of the fact that the
$A'_i$ vectors are identical with the row vectors of matrix [C].
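The vector method can be sketched as follows, under stated assumptions. The system is a hypothetical one with n = 3 and p = 2 (not the appendix B example), and the function names are invented here. The Gram-Schmidt step is done without normalizing, which is harmless because the scaling of eq. (12) cancels any normalization of $A'_{ui}$; this also keeps the arithmetic exact.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthogonal_component(v, others):
    """Component of v orthogonal to span(others), by Gram-Schmidt without normalization."""
    basis = []
    for w in others:
        for b in basis:
            c = dot(w, b) / dot(b, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        basis.append(w)
    for b in basis:
        c = dot(v, b) / dot(b, b)
        v = [vi - c * bi for vi, bi in zip(v, b)]
    return v

def primed_vectors(columns):
    """Return the vectors A'_i of eq. (12), one per column vector A_i."""
    cols = [[Fraction(x) for x in col] for col in columns]
    primed = []
    for i, a_i in enumerate(cols):
        others = cols[:i] + cols[i + 1:]
        w = orthogonal_component(a_i, others)   # parallel to A'_ui
        scale = dot(w, a_i)                     # enforces A'_i . A_i = 1, eq. (15)
        primed.append([wi / scale for wi in w])
    return primed

# Hypothetical column vectors A_1, A_2 and measured vector M (assumptions for illustration).
columns = [[1, 0, 1], [0, 1, 1]]
M = [1, 2, 4]
A_primed = primed_vectors(columns)
x_c = [dot(a, M) for a in A_primed]             # eq. (16)
print(A_primed)
print(x_c)
```

For this system the two $A'_i$ vectors come out as (2/3, -1/3, 1/3) and (-1/3, 2/3, 1/3), which are the rows of $[B]^{-1}[A]^T$ for the same coefficients.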

STATISTICAL STUDY OF ERROR 

We assume that the source of equations (1) involved effects due to the existence of
a true value of each $x_i$, which we denote with the symbol $x_{ti}$. We assume each $a_{ji}$ in
equations (1) to be a precisely known constant. The $m_j$ values (often results of
measurements in a physical system) we assume to be unreliable for one or the other or both
of two reasons: (1) The physical system that generates equations (1) may actually
involve the effects of more than p unknowns. (2) The entities that are measured to
obtain the values $m_j$ may not be measured with great accuracy. As an example of the
first reason, the p values $x_{ti}$ might be the concentrations of p gases such as CO,
CO2, H2O, and so on, within a mixture of gases. But additional gases could be present
within the same mixture and could affect the entities that are measured to obtain the
values $m_j$. The presence of those other gases within the mixture might be unknown or
might be deliberately neglected. We may rewrite equations (1) as

$$\sum_{i=1}^{p} a_{ji} x_{ti} + \epsilon_j = m_j \tag{17}$$

where unknowns $\epsilon_j$ are included and are now defined as having such values as may be
necessary in order to make the equations exact. It is, of course, to be expected that
the $\epsilon_j$ values will usually be nonzero. For, if replacement of all $x_i$'s in equations (1)
by true values $x_{ti}$ did not result in inequalities, there would be no point in a
least-squares solution of the overdetermined system. That is, if inequalities were not
created, one could as well accept p of the n equations at random and reject the rest. We
now define an error vector as

$$E = \sum_{j=1}^{n} \epsilon_j U_j \tag{18}$$

We now wish to examine the statistical effect of the error vector E on the interrelation
between the $x_{ci}$ and the $x_{ti}$ values, that is, the influence of E upon an algebraic
error $\eta_{ai}$ defined as

$$\eta_{ai} = x_{ci} - x_{ti} \tag{19}$$

With use of equations (17) instead of (1), equations (13) and (14) become

$$A'_i \cdot M = \sum_{k=1}^{p} x_{tk} \, A'_i \cdot A_k + A'_i \cdot E = x_{ti} \, A'_i \cdot A_i + A'_i \cdot E \tag{20}$$

Equation (16) properly becomes

$$x_{ti} = A'_i \cdot M - A'_i \cdot E = x_{ci} - A'_i \cdot E \tag{21}$$

From equation (21) and the defining equation (19),

$$\eta_{ai} = A'_i \cdot E \tag{22}$$

Statistical distributions can be derived for the algebraic error $\eta_{ai}$, under a basic
assumption that the vector E is randomly oriented, with uniform distribution
throughout all the possible orientations within the n-dimensional space. These derivations are
possible because a component of E is uniquely determined by equations (17).

Determination of a Component of Error Vector 

The full n-dimensional space, which will be referred to hereafter as $S_n$, may be
resolved into two mutually orthogonal subspaces $S_p$ and $S_o$. The subspace $S_p$ is
p-dimensional, and is so defined that the vector $A_i$ and, hence, $A'_i$ lie exclusively
within it for any value of i. The subspace $S_o$ is (n - p)-dimensional. Let the
component of E lying within subspace $S_o$ be denoted by $E_o$. As will be shown, this
component is uniquely determined both as to magnitude and direction by equations (17).
Neither magnitude nor direction of $E_p$, the component of E within the subspace $S_p$, can
be determined. However, with $|E_o|$ known, useful statistical relations can be
developed.

For any value of i, we will also consider an (n - p + 1)-dimensional subspace $S_{o+i}$
that will include the entire subspace $S_o$ and, in addition, the direction of the $A'_i$ vector.
We will consider the interrelations of $E_o$, $E_i$ (the component of E lying outside
subspace $S_o$ but within subspace $S_{o+i}$, i.e., in the direction of $A'_i$), and $E_{o+i}$ (the
component of E lying within subspace $S_{o+i}$). Thus,

$$E_{o+i} = E_o + E_i \tag{23}$$

With $|E_o|$ known exactly, we will show that a statistical distribution of $\eta_{ai}$ may be
derived.

From equations (8), (9), (17), and (18)

$$\sum_{i=1}^{p} x_{ti} A_i + E = M \tag{24}$$

or

$$\sum_{i=1}^{p} x_{ti} A_i + E_p + E_o = M_p + M_o \tag{25}$$

where $E_p$ and $M_p$ are respectively the components of E and M within the subspace
$S_p$, and $M_o$ is the component of M within the subspace $S_o$.

Because of the mutual orthogonality of $S_p$ and $S_o$, we may separate equation (25)
into two equations,

$$\sum_{i=1}^{p} x_{ti} A_i + E_p = M_p \tag{26}$$

and

$$E_o = M_o \tag{27}$$


We see from equation (27) that with the vector M and a set of mutually orthogonal unit
vectors spanning the subspace $S_o$ we could determine the vector component $E_o$
directly. That is, if we denote that orthogonal set of unit vectors as $O_k$ (k = 1 to n - p),

$$E_o = M_o = \sum_{k=1}^{n-p} (M \cdot O_k) \, O_k \tag{28}$$

In many cases, much saving of computation time may be effected with use of an indirect
method suggested in personal communication from Lynn U. Albers. This method uses a
set of mutually orthogonal unit vectors spanning the $S_p$ subspace, which we denote by
$P_k$ (k = 1 to p). The vector component $M_p$ is determined and subtracted from the
vector M to give $M_o$ or $E_o$. That is,

$$E_o = M_o = M - M_p = M - \sum_{k=1}^{p} (M \cdot P_k) \, P_k \tag{29}$$


The $P_k$ vectors may be obtained by a Gram-Schmidt orthogonalization of the $A_i$
vectors. In the example presented in appendix A, with n = 4 and p = 3, the $P_k$
vectors might be

$$P_1 = \text{(unit vector from eq. (A4))}, \qquad P_2 = \text{(unit vector from eq. (A2))}, \qquad P_3 = \text{(unit vector from eq. (A1))} \tag{30}$$


The $O_k$ vectors may be found by the following procedure:

(1) Select a set of n - p unit vectors $U'_k$ randomly oriented within the space $S_n$.
Alternatively, use $U'_1 = U_1$, $U'_2 = U_2$, and so on.

(2) Continue the Gram-Schmidt orthogonalization by which the $P_k$ vectors were
obtained to include the $U'_k$ vectors for k = 1 to n - p.

(3) Denote the n - p additional unit vectors resulting from the continuation of the
Gram-Schmidt orthogonalization by $O_k$ (k = 1 to n - p).

Use of the $P_k$ vectors, with equation (29), would involve much less computation in
any case where a single set of simultaneous equations is to be solved by the
least-squares method. In cases where many sets of overdetermined simultaneous equations
are to be solved, always with the same $A_i$ vectors but with different M vectors, and
with n - p smaller than p, the method using the $O_k$ vectors and equation (28) might
involve less computation.
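The indirect route of equation (29) can be sketched as follows. This is a minimal illustration, not the report's program: the column vectors and measured vector are hypothetical, the function names are invented here, and floating-point arithmetic is used so that the $P_k$ can be genuine unit vectors.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormal vectors P_k spanning the same subspace as the given vectors."""
    basis = []
    for v in vectors:
        w = [float(x) for x in v]
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))   # zero here would mean linearly dependent columns
        basis.append([wi / norm for wi in w])
    return basis

def error_component_in_S_o(columns, M):
    """E_o = M - sum_k (M . P_k) P_k, eq. (29)."""
    P = gram_schmidt(columns)
    E_o = [float(m) for m in M]
    for P_k in P:
        c = dot(M, P_k)
        E_o = [e - c * pk for e, pk in zip(E_o, P_k)]
    return E_o

# Hypothetical data (assumptions for illustration): n = 3, p = 2.
columns = [[1, 0, 1], [0, 1, 1]]
M = [1, 2, 4]
E_o = error_component_in_S_o(columns, M)
magnitude = math.sqrt(dot(E_o, E_o))
print(E_o, magnitude)
```

The result is orthogonal to every $A_i$ and coincides with the ordinary least-squares residual, which is why $|E_o|$ is available from the data alone.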


Probability Density Function for Ratio of Magnitudes 
of Two Components of Error Vector 

We designate as $\theta_{o+i}$ the angle between $E_{o+i}$ and $A'_i$. We see that

$$|E_i| = |E_{o+i}| \cos\theta_{o+i} \tag{31}$$

and that

$$|E_o| = |E_{o+i}| \sin\theta_{o+i} \tag{32}$$

Thus

$$\frac{|E_i|}{|E_o|} = \frac{\cos\theta_{o+i}}{\sin\theta_{o+i}} \tag{33}$$

Because the ratio $|E_i|/|E_o|$ will be used repeatedly, we introduce here the symbol

$$r = \frac{|E_i|}{|E_o|} \tag{34}$$

Later we shall also need the following relation, from equations (23) and (34):

$$|E_{o+i}| = |E_o| \sqrt{1 + r^2} \tag{35}$$


It can be shown easily that the basic assumption, that the orientation of E is
uniformly distributed within the space $S_n$, implies that the orientation of $E_{o+i}$ is
uniformly distributed within the $S_{o+i}$ subspace. Under that assumption, we now wish to derive
a probability density function for r of equation (34). We will do so disregarding for the
moment the fact that $|E_o|$ is known. We will make the further assumption that the
magnitude of E and the orientation of E are independently distributed and, hence, that the
magnitude of E and the value of r are independently distributed.

We will derive the density function for r indirectly, using the density function for
the absolute value of the dot product of a fixed or specific unit vector $A_s$ with another
unit vector R that is randomly oriented, with uniform distribution, within an
n-dimensional space. For convenience, we will use the symbol

$$\beta = |A_s \cdot R| \tag{36}$$

With such notation, appendix C presents a derivation of the following equation for the
expected value E[β]:

$$E[\beta] = \left(\frac{2}{\pi}\right)^{s} \prod_{k=1}^{n-1} k^{(-1)^{s+k+1}} \qquad \begin{pmatrix} s = 1 \text{ for } n \text{ even} \\ s = 0 \text{ for } n \text{ odd} \end{pmatrix} \tag{C12}$$


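The density function for β, also derived in appendix C, is

$$f(\beta) = \left(\frac{2}{\pi}\right)^{s} \left[\prod_{k=1}^{n-2} k^{(-1)^{s+k+1}}\right] \left(1 - \beta^2\right)^{(n-3)/2} \qquad \begin{pmatrix} s = 1 \text{ for } n \text{ even} \\ s = 0 \text{ for } n \text{ odd} \end{pmatrix} \tag{C16}$$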


Other functions of β derived in appendix C, but which will not be needed here, are the
distribution function F(β) (eq. (C18)) and, for the special case of n = 100, an
approximation of f(β) as one side of a normal distribution (eq. (C19)). All these equations for
functions of β will also apply if $A_s$ is a randomly oriented unit vector according to
any type of distribution, so long as R has the uniform distribution.
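As a numerical cross-check, the product forms of equations (C12) and (C16), as reconstructed in this section from a damaged scan, can be compared with the gamma-function expression $E[\beta] = \Gamma(n/2) / (\sqrt{\pi}\,\Gamma((n+1)/2))$ for a uniformly oriented unit vector. The gamma form is a standard result and is not taken from the report, and the function names below are invented for this sketch.

```python
import math

def product_constant(upper, n):
    """(2/pi)^s * prod_{k=1}^{upper} k^((-1)^(s+k+1)), with s = 1 for even n, 0 for odd n."""
    s = 1 if n % 2 == 0 else 0
    value = (2.0 / math.pi) ** s
    for k in range(1, upper + 1):
        if (s + k + 1) % 2 == 0:
            value *= k      # exponent +1
        else:
            value /= k      # exponent -1
    return value

def expected_beta(n):
    """E[beta] in the product form of eq. (C12)."""
    return product_constant(n - 1, n)

def density_beta(beta, n):
    """f(beta) in the product form of eq. (C16), for 0 <= beta < 1."""
    return product_constant(n - 2, n) * (1.0 - beta * beta) ** ((n - 3) / 2.0)

def expected_beta_gamma(n):
    """Independent closed form used only as a check (not from the report)."""
    return math.gamma(n / 2.0) / (math.sqrt(math.pi) * math.gamma((n + 1) / 2.0))

for n in (2, 3, 4, 10, 100):
    print(n, expected_beta(n), expected_beta_gamma(n))
```

For n = 3 both forms give 1/2, the familiar mean of |cos θ| for a random direction in ordinary space, and E[β] falls steadily as n grows.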

Equation (C16) gives the density function for $\cos\theta_{o+i}$ upon substitution of n - p + 1
for n, and substitution of $\cos\theta_{o+i}$ for β. From the density function for $\cos\theta_{o+i}$ we
can deduce the density function for r. For that purpose, and for several later
applications, we will need the standard formula for change of variable in the probability density
function,

$$g(y) = f(x) \left| \frac{dx}{dy} \right| \tag{37}$$

where y is a monotone increasing or decreasing function of x, g(y) is the probability
density function for y, and f(x) is the density function for x. If it is wished to
eliminate x from the expression obtained for g(y), x may, of course, be replaced by its
equivalent in terms of y. (See ref. 3 or other text on mathematical statistics for
derivation of eq. (37).)

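From equations (33), (34), (C16), and (37), the probability density function of r is

$$g(r) = \left(\frac{2}{\pi}\right)^{s} \left[\prod_{k=1}^{n-p-1} k^{(-1)^{s+k+1}}\right] \left(1 + r^2\right)^{(p-n-1)/2} \qquad \begin{pmatrix} s = 1 \text{ for } n - p + 1 \text{ even} \\ s = 0 \text{ for } n - p + 1 \text{ odd} \end{pmatrix} \tag{38}$$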
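The intermediate steps behind equation (38) are not spelled out in the text. One way to fill them in, offered as a reading of equations (33), (34), (C16), and (37) rather than as the author's own working, is the following, where K is shorthand (introduced here) for the constant factor of (C16) with n replaced by n - p + 1:

```latex
\begin{align*}
\beta &= \cos\theta_{o+i} = \frac{r}{\sqrt{1 + r^2}}, \qquad
1 - \beta^2 = \frac{1}{1 + r^2}, \qquad
\frac{d\beta}{dr} = \left(1 + r^2\right)^{-3/2}, \\
f(\beta) &= K \left(1 - \beta^2\right)^{(n-p-2)/2}, \qquad
K = \left(\frac{2}{\pi}\right)^{s} \prod_{k=1}^{n-p-1} k^{(-1)^{s+k+1}}, \\
g(r) &= f(\beta) \left|\frac{d\beta}{dr}\right|
      = K \left(1 + r^2\right)^{-(n-p-2)/2} \left(1 + r^2\right)^{-3/2}
      = K \left(1 + r^2\right)^{(p-n-1)/2}.
\end{align*}
```

Because β increases monotonically from 0 to 1 as r runs from 0 to infinity, the condition attached to equation (37) is satisfied.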


Values of g(r) according to equation (38), for n = 100 and p = 10, are plotted in
figure 1. The plotted points fit, within the plotting accuracy, the following probability
density function:

$$g(r) = 7.5484 \exp\left[-\frac{1}{2}\left(\frac{r}{0.1057}\right)^2\right] \tag{39}$$

Note that the standard deviation 0.1057, as it appears in equation (39), is approximately
$(n - p + 1)^{-1/2}$. This condition should not be expected for small n - p.
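The agreement between equations (38) and (39) can be checked numerically. The sketch below evaluates equation (38) as reconstructed here and the normal-curve fit of equation (39); the function names are invented for the illustration, and the fit is meaningful only for n = 100 and p = 10.

```python
import math

def g_exact(r, n, p):
    """Density of r in the form of eq. (38) as reconstructed in this section."""
    m = n - p + 1
    s = 1 if m % 2 == 0 else 0
    const = (2.0 / math.pi) ** s
    for k in range(1, n - p):            # k = 1 .. n - p - 1
        if (s + k + 1) % 2 == 0:
            const *= k
        else:
            const /= k
    return const * (1.0 + r * r) ** ((p - n - 1) / 2.0)

def g_normal(r):
    """One-sided normal fit of eq. (39); applies to n = 100, p = 10 only."""
    return 7.5484 * math.exp(-0.5 * (r / 0.1057) ** 2)

for r in (0.0, 0.1, 0.2, 0.3):
    print(r, round(g_exact(r, 100, 10), 4), round(g_normal(r), 4))
```

At r = 0 the exact constant agrees with the coefficient 7.5484 of equation (39), and over the range tabulated the two curves stay close enough to be indistinguishable on a plot.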



Figure 1. - Density function for r with prior density function of $|E_{o+i}|$ disregarded, for
n = 100 and p = 10. (Abscissa: r.)



Probability Density Function for Algebraic Error 


An approximate expression for the probability density function of $\eta_{ai}$ given $|E_o|$
may now be obtained with use of equation (22) rewritten with use of equation (34) as

$$\eta_{ai} = |A'_i| \, A'_{ui} \cdot E = \pm |A'_i| \, |E_i| = \pm r \, |A'_i| \, |E_o| \tag{40}$$

and with use of equations (37) and (38) or (39), with due allowance for the fact that the density
function of $\eta_{ai}$ must be normalized for both plus and minus values rather than for plus
only as with equations (38) and (39). An assumption will be made that the density
function of r given $|E_o|$ approximates the density function of r. That is,

$$g(r \mid |E_o|) \approx g(r) \tag{41}$$


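The accuracy of this approximation will be examined later. With use of the relations
indicated, and with normalization for both positive and negative values of $\eta_{ai}$,

$$f(\eta_{ai} \mid |E_o|) \approx \frac{1}{2 |A'_i| |E_o|} \left(\frac{2}{\pi}\right)^{s} \left[\prod_{k=1}^{n-p-1} k^{(-1)^{s+k+1}}\right] \left(1 + r^2\right)^{(p-n-1)/2} \qquad \begin{pmatrix} s = 1 \text{ for } n - p + 1 \text{ even} \\ s = 0 \text{ for } n - p + 1 \text{ odd} \\ -\infty < \eta_{ai} < \infty \\ 0 \le r < \infty \end{pmatrix} \tag{42}$$

in which, by equation (40), $r = |\eta_{ai}| / (|A'_i| |E_o|)$; or, according to equation (39),
for n = 100 and p = 10,

$$f(\eta_{ai} \mid |E_o|) \approx \frac{7.5484}{2 |A'_i| |E_o|} \exp\left[-\frac{1}{2}\left(\frac{\eta_{ai}}{0.1057 \, |A'_i| |E_o|}\right)^2\right] \tag{43}$$

Here also note that the standard deviation approximates $|A'_i| |E_o| (n - p + 1)^{-1/2}$ for
large n - p.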

In order to derive an exact equation corresponding to the approximate equation (42),
a prior distribution represented by a density function $f(|E_{o+i}|)$ must be considered.
This prior distribution depends on the source of equations (1). It relates to the
probability of existence of any given value of $|E_{o+i}|$ if the values of r and $|E_o|$ are unknown
or disregarded. An equation for $f(r \mid |E_o|)$ (eq. (D7)) and an exact equation
corresponding to the approximate equation (42) (eq. (D8)) are derived in appendix D with due
regard to the prior density function $f(|E_{o+i}|)$.



The density function g(r) according to equation (38) and the density function
$f(r \mid |E_o|)$ according to equation (D7) are identical if the density function of
$\ln |E_{o+i}|$ is made equal to a constant. Jeffreys (ref. 4) argues that, in the absence of any
information to the contrary, the unbiased distribution for a random variable constrained to
nonnegative values should be that giving uniform distribution to its logarithm. The identity
of the right-hand sides of equations (38) and (D7) when the density function is uniform for
$\ln |E_{o+i}|$ is in harmony with that opinion.

With a given physical system that generates the same system of equations (1) many
times, but with different $m_j$ values from one time to another, a histogram could be
constructed for the frequencies of $|E_o|$ values determined from time to time with use
of equation (28) or (29). From the histogram, an expression for the density function
$f(|E_o|)$ could be found empirically. From that density function $f(|E_o|)$, a density
function for $|E_{o+i}|$ could be derived for use in equations (D7) and (D8).


Density Function of r as an Approximation for Density Function of r Given $|E_o|$

We will now use equation (D7) to obtain an estimate of the degree of accuracy of
equation (41). In order to do so, we will assume that $|E_{o+i}|$ is distributed as the
positive side of a standard normal distribution curve. That is, the following density function
is assumed:

$$f(|E_{o+i}|) = \sqrt{\frac{2}{\pi}} \exp\left(-\frac{1}{2} |E_{o+i}|^2\right) \tag{44}$$

Under that assumption, equation (D7) becomes

$$g(r \mid |E_o|) = \frac{\left(1 + r^2\right)^{(p-n)/2} \exp\left[-\frac{1}{2}\left(|E_o| \sqrt{1 + r^2}\right)^2\right]}{\displaystyle\int_0^{\infty} \left(1 + r^2\right)^{(p-n)/2} \exp\left[-\frac{1}{2}\left(|E_o| \sqrt{1 + r^2}\right)^2\right] dr} \tag{45}$$


The solid curve in figure 2(a) represents values of density function plotted on a
logarithmic scale against values of r according to equation (38) for n - p = 10. Also
shown for the same value of n - p, as the plotted points, are probability densities
according to equation (45) for values of $|E_o|$ equal to 0.1, 1.0, 2.0, and 3.0. These
values of $|E_o|$, of course, can be considered as multiples of the standard deviation of
$|E_{o+i}|$, which by equation (44) has been assumed to be one.

From the figure we see that, for n - p = 10, in the use of g(r) according to
equation (38) as a substitute for $g(r \mid |E_o|)$ according to equation (D7):

(1) Equation (38) is accurate within a range from 30-percent underestimate to
100-percent overestimate of the true value of the density function, for all values
0.1 < $|E_o|$ < 3.0, for values of r < 0.5.

(2) Equation (38) is even more accurate for all values of r < 1.0 and values
0.1 < $|E_o|$ < 1.0.

(3) For large $|E_o|$ and large r, where equation (38) is not very accurate, it is at
least conservative in the sense that it overestimates the density function.
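The comparison summarized above can be reproduced approximately by evaluating equations (38) and (45) numerically for n - p = 10. The sketch below uses the forms of those equations as reconstructed in this section; the function names, the integration limit, and the step count are ad hoc choices made for the illustration, not values from the report.

```python
import math

def g38(r, n_minus_p):
    """Eq. (38): density of r with the prior on |E_o+i| disregarded."""
    m = n_minus_p + 1
    s = 1 if m % 2 == 0 else 0
    const = (2.0 / math.pi) ** s
    for k in range(1, n_minus_p):        # k = 1 .. n - p - 1
        const = const * k if (s + k + 1) % 2 == 0 else const / k
    return const * (1.0 + r * r) ** (-(n_minus_p + 1) / 2.0)

def unnormalized_g45(r, e_o, n_minus_p):
    """Numerator of eq. (45) for the half-normal prior of eq. (44)."""
    q = 1.0 + r * r
    return q ** (-n_minus_p / 2.0) * math.exp(-0.5 * e_o * e_o * q)

def g45(r, e_o, n_minus_p, upper=20.0, steps=20000):
    """Eq. (45), with the normalizing integral done by the trapezoidal rule."""
    h = upper / steps
    total = 0.5 * (unnormalized_g45(0.0, e_o, n_minus_p)
                   + unnormalized_g45(upper, e_o, n_minus_p))
    for i in range(1, steps):
        total += unnormalized_g45(i * h, e_o, n_minus_p)
    return unnormalized_g45(r, e_o, n_minus_p) / (total * h)

# Ratio of eq. (38) to eq. (45): below 1 is an underestimate, above 1 an overestimate.
for e_o in (0.1, 1.0, 2.0, 3.0):
    for r in (0.0, 0.25, 0.5):
        print(e_o, r, round(g38(r, 10) / g45(r, e_o, 10), 3))
```

With $|E_o|$ = 3.0 the ratio comes out near 0.72 at r = 0 and near 2.0 at r = 0.5, in line with the range quoted in item (1).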
Figure 2(b) is a similar plot for n - p = 100. It shows that equation (38) is a good
approximation except for the combination of conditions $|E_o|$ > 2.0 and r ≥ 0.3. The
fact that this combination of conditions is relatively rare may be seen from the
ordinates in the figure. It is clear from comparison of the two parts of figure 2 that
equation (38) is increasingly exact for use as if it were an expression for $g(r \mid |E_o|)$ instead
of an expression for g(r) with increasing value of n - p.

A numerical example, with use of equations (38) and (42), for n = 6 and p = 3, is
included in appendix E. Also reported in that appendix are the statistical results of
least-squares solutions for 10 000 similar systems with random values of $a_{ji}$, $\epsilon_j$, and
$x_{ti}$. The results of the 10 000 solutions are plotted and compared there with a curve
representing equation (38). The results from the random problems agree with
equation (38) within the plotting accuracy. The results show, therefore, that with the density
function $f(|E_{o+i}|)$ that existed for the method of random selection of values of $\epsilon_j$,
equation (38) gives on the average quite accurate values of $g(r \mid |E_o|)$ even with n - p + 1 as
small as four.
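A small Monte Carlo experiment in the same spirit can be sketched as follows. It is not the appendix E program: rather than solving random systems, it draws random column vectors and a Gaussian error vector (which is uniformly oriented), forms r for i = 1 directly from its definition, and compares the fraction of trials with r ≤ 1 against the integral of equation (38) for n = 6 and p = 3. The seed, the trial count, and all function names are choices made for this sketch.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormal vectors spanning the same subspace, in the order given."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis

def sample_r(rng, n, p):
    """One draw of r = |E_i| / |E_o| for i = 1."""
    cols = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
    E = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Orthogonalizing the other columns first makes the last vector equal to A'_u1.
    basis = gram_schmidt(cols[1:] + cols[:1])
    E_i = abs(dot(E, basis[-1]))
    E_o = list(E)
    for b in basis:                      # remove the component in S_p, as in eq. (29)
        c = dot(E_o, b)
        E_o = [e - c * bi for e, bi in zip(E_o, b)]
    return E_i / math.sqrt(dot(E_o, E_o))

def cdf_n6_p3(r):
    """Integral of eq. (38) from 0 to r for n - p + 1 = 4, where g(r) = (4/pi)(1 + r^2)^-2."""
    return (2.0 / math.pi) * (r / (1.0 + r * r) + math.atan(r))

rng = random.Random(12345)
trials = 5000
samples = [sample_r(rng, 6, 3) for _ in range(trials)]
empirical = sum(1 for r in samples if r <= 1.0) / trials
print(empirical, cdf_n6_p3(1.0))
```

Because this sketch imposes no prior on the magnitude of E beyond independence from its orientation, it tests g(r) itself rather than $g(r \mid |E_o|)$.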

To the same extent that g(r) may be used as an approximation for $g(r \mid |E_o|)$,
equation (42) may be used as an approximation for equation (D8). This condition is true
because equation (41) was used in derivation of equation (42).


Statistical Considerations Regarding an Anticipated System of Equations 

Relative to guidance of design for a physical system that is expected to generate an 
overdetermined system of simultaneous equations, we now wish to consider some statis- 
tical relations that may be expected within the system of equations before it has been 
generated. 

We postulate that the following details only are known regarding the system of
equations (17) that are expected to be generated:

(1) The values of n and p

(2) All of the $a_{ji}$ values

(3) Possible or probable $x_{ti}$ values

(4) Possible or probable values of |E| (eq. (18)), which may be inferred from known
or postulated facts regarding the potential sources of error

In particular, no values of $m_j$ as yet exist and, in fact, no overdetermined system of
simultaneous equations yet exists.

We wish to consider the magnitudes of the algebraic errors $\eta_{ai}$ that may be
anticipated when the simultaneous equations have been generated and solved, in relation to the
anticipated magnitudes of $x_{ti}$ and E.



From equations (11), (12), (15), and (22),

$$\eta_{ai} = A'_i \cdot E = \frac{A'_{ui} \cdot E}{A'_{ui} \cdot A_i} = \frac{|E| \left(A'_{ui} \cdot \dfrac{E}{|E|}\right)}{|A_i| \left(A'_{ui} \cdot A_{ui}\right)} \tag{46}$$

The denominator in the final form of equation (46) is necessarily positive (eq. (15)).
Note that this equation applies in general without need for specific known values of $m_j$.
The expected absolute value of the dot product of unit vectors in the numerator in the
final form of equation (46) should have the value given for E[β] in equation (C12). Hence,
the expected absolute value of the algebraic error is

$$E\left[|\eta_{ai}|\right] = \frac{E[\beta] \, |E|}{|A_i| \left(A'_{ui} \cdot A_{ui}\right)} \tag{47}$$

or

$$E\left[|\eta_{ai}|\right] = \frac{E[\beta] \, |E|}{|A_i| \cos\alpha_i} \tag{48}$$

where $\alpha_i$ is the angle between the vectors $A'_i$ and $A_i$.
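Equation (48) can be evaluated before any measurements exist, as the following sketch shows. The column vector, the unit vector $A'_{u1}$, and the unit error magnitude are hypothetical values chosen for illustration, and E[β] is computed from the standard gamma-function expression rather than from the product form of equation (C12).

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def expected_abs_error(a_i, a_prime_u_i, n, error_magnitude):
    """E[|eta_ai|] = E[beta] |E| / (|A_i| cos(alpha_i)), eq. (48)."""
    # Closed form for E[beta] in n dimensions (an assumption of this sketch).
    e_beta = math.gamma(n / 2.0) / (math.sqrt(math.pi) * math.gamma((n + 1) / 2.0))
    a_u = [x / norm(a_i) for x in a_i]          # A_ui, eq. (11)
    cos_alpha = dot(a_prime_u_i, a_u)           # A'_ui . A_ui
    return e_beta * error_magnitude / (norm(a_i) * cos_alpha)

# Hypothetical case with n = 3, p = 2: A_1 = (1, 0, 1), A_2 = (0, 1, 1).
# The component of A_1 orthogonal to A_2 is parallel to (2, -1, 1).
a_1 = [1.0, 0.0, 1.0]
a_prime_u1 = [v / math.sqrt(6.0) for v in (2.0, -1.0, 1.0)]
result = expected_abs_error(a_1, a_prime_u1, 3, 1.0)
print(result)
```

Since $|A'_i| = 1 / (|A_i| \cos\alpha_i)$ by equations (12) and (15), the same number is obtained as $E[\beta] \, |E| \, |A'_i|$, which here is $0.5 \times \sqrt{6}/3$.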

We now wish to show that $|\eta_{ai}|$ generally tends to be smaller with larger values of
n, for a given number of unknowns p. In equation (48), the ratio $|E| / |A_i|$ should not be
systematically affected by the value of n. But E[β] will be smaller with large n
according to equation (C12). So if we can show that $\cos\alpha_i$ tends to be greater with larger
n it will necessarily follow that $E[|\eta_{ai}|]$ tends to be smaller with larger n. This fact
may be shown as follows.

In the absence of contrary information, $A_{ui}$ may be regarded as randomly oriented
with uniform distribution within the $S_n$ space. Consider a (p - 1)-dimensional subspace
$S_{p-1}$, spanned by the $A_{uk}$ (k ≠ i) vectors, and its orthogonal complement $S_{n-p+1}$. The
vector $A_{ui}$ has components that we now denote by $A_{p-1}$ and $A_{n-p+1}$ within subspaces
$S_{p-1}$ and $S_{n-p+1}$, respectively. $A_{n-p+1}$ extends in the same direction as $A'_i$. Hence,

$$\cot\alpha_i = \frac{|A_{n-p+1}|}{|A_{p-1}|} \tag{49}$$
We may now infer whether the expected value of $\cot\alpha_i$ is an increasing function of n with fixed p. To do so, we may use equation (38), recognizing that $S_{p-1}$ is analogous with $S_o$, $S_{n-p+1}$ with $S_p$, $A_{ui}$ with $E$, $A_{n-p+1}$ with $E_p$, and $A_{p-1}$ with $E_o$. If we can infer from equation (38) that the expected value of $|E_p|/|E_o|$ increases when n and p are increased by the same integral value, it follows that the expected value of $\cot\alpha_i$ of equation (49) increases when n is increased with p unchanged.

Now consider a special case in which the $A_{ui}$ vectors, and hence the $A_i'$ vectors also, are an orthogonal set. Then

$$\frac{|E_p|}{|E_o|} = \sqrt{\sum_{i=1}^{p} r_i^2} \tag{50}$$

where $r_i$ is the same as r of equation (34). But by inspection of equation (38) we see that the expected value of each $r_i$ will be unchanged when n and p are increased by the same integral value. So an increase in the number of summed terms under the radical in equation (50), with unchanged expected values of the individual terms, means that the expected value of $|E_p|/|E_o|$ increases. It follows that the expected value of $\cot\alpha_i$ of equation (49) increases with increase of n without change of p, $E[\cos\alpha_i]$ increases, and $E[|\eta_{ai}|]$ decreases.

An objection might be made that the values of $r_i$ are not independent. But now consider, in turn, the expected values of $r_{iA}$, $r_{iB}$, $r_{kA}$, and $r_{kB}$ ($k \ne i$), for two systems A and B with the same value of $n_A - p_A$ and $n_B - p_B$ but with $p_B$ larger than $p_A$. By equation (38), with no value given for any $r_{kA}$ or $r_{kB}$, the expected values of $r_{iA}$ and $r_{iB}$ must be the same. Then, given the same values of $r_{iA}$ and $r_{iB}$, the values of $|E_{(o+i)A}|$ and of $|E_{(o+i)B}|$, similarly defined for B, must be the same. Now if we designate $r'_{kA}$ as $|E_{kA}|/|E_{(o+i)A}|$ and $r'_{kB}$ as $|E_{kB}|/|E_{(o+i)B}|$, we see that the expected value of either $r'_{kA}$ or $r'_{kB}$ is given by equation (38), with n - p increased by one. It follows that the expected values of $r'_{kA}$ and $r'_{kB}$ are the same, hence the expected values of $|E_{kA}|$ and $|E_{kB}|$ are the same, and finally the expected values of $r_{kA}$ given $r_{iA}$ and of $r_{kB}$ given $r_{iB}$ are the same. This procedure could be continued for expected values of each $r_{jA}$ ($j \ne i$ or k) and the corresponding expected values of $r_{jB}$. It is clear, therefore, that the interdependence of the values of $r_{iA}$ (i = 1 to $p_A$) and of the values of $r_{iB}$ (i = 1 to $p_B$) does not vitiate the foregoing argument.

In the derivation of a density function for the error in the least-squares value of an unknown, it will prove convenient to use the absolute value of $\eta_{ai}$ expressed as a fraction of the absolute value of $x_{ti}$. That is, we will use an absolute fractional error

$$\eta_i = \left|\frac{\eta_{ai}}{x_{ti}}\right| \tag{51}$$


As an example, for the specific case of n = 100, we may use the normal approximation for $f(\beta)$ given by the equation

$$f(\beta) = 7.876 \exp\left[-\frac{1}{2}\left(\frac{\beta}{0.1013}\right)^2\right] \tag{C19}$$


derived in appendix C. With equations (C19) and (46), the probability density function of $\eta_i$ may readily be found. With use of the relation expressed by equation (37), the probability density function for $\eta_i$ is

$$g(\eta_i) = \left[ f(\beta)\left|\frac{d\beta}{d\eta_i}\right| \right]_{\beta = \eta_i |x_{ti}A_i| \cos\alpha_i / |E|} \tag{52}$$


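Equation (52) is the ordinary normal distribution equation, normalized relative to positive values only, for the special case of n = 100. That is,

$$g(\eta_i) = \frac{\sqrt{2}}{\sqrt{\pi}\,\sigma_i} \exp\left[-\frac{1}{2}\left(\frac{\eta_i}{\sigma_i}\right)^2\right] \tag{53}$$

where

$$\sigma_i = \frac{0.1013\,|E|}{|x_{ti}A_i| \cos\alpha_i} \tag{54}$$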


Equations (52), (53), and (54) depend on the value of $|E|$. They are useful for judging the accuracy to be expected in the least-squares solution of an overdetermined system of simultaneous equations that will be generated by a given physical system with n = 100 when the values of $|x_{ti}A_i|$ and $|E|$ that are likely to be encountered are known or postulated. Analogous expressions might be found for getting comparable results with other dimensionalities.
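
As a rough numerical illustration of equations (53) and (54), the following Python sketch evaluates the scale parameter and the folded normal density for n = 100. The function names and the input values are hypothetical and are not taken from this report. The exceedance probability uses the standard complementary error function for a normal law folded onto positive values.

```python
import math

SIGMA_BETA_N100 = 0.1013  # standard deviation of beta for n = 100, equation (C19)

def sigma_fractional(err_norm, x_true, a_norm, cos_alpha, sigma_beta=SIGMA_BETA_N100):
    """Equation (54): scale parameter of the absolute fractional error."""
    return sigma_beta * err_norm / (abs(x_true) * a_norm * cos_alpha)

def density_fractional(eta, sigma):
    """Equation (53): normal density folded onto eta >= 0."""
    return math.sqrt(2.0 / math.pi) / sigma * math.exp(-0.5 * (eta / sigma) ** 2)

def prob_exceeds(eta, sigma):
    """P(fractional error > eta) under the folded normal law of equation (53)."""
    return math.erfc(eta / (sigma * math.sqrt(2.0)))

# Hypothetical inputs, chosen only for illustration:
# |E| = 0.5, x_ti = 2.0, |A_i| = 3.0, cos(alpha_i) = 0.8.
sigma_i = sigma_fractional(err_norm=0.5, x_true=2.0, a_norm=3.0, cos_alpha=0.8)
print(f"sigma_i = {sigma_i:.5f}")
print(f"P(fractional error > 2 percent) = {prob_exceeds(0.02, sigma_i):.4f}")
```

With $\sigma_i$ = 0.1013 the density at zero returns the coefficient 7.876 of equation (C19), which is a convenient check on the normalization.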
APPLICATIONS


Some types of physical systems provide overdetermined systems of linear simultaneous equations on a continuing basis. The coefficients $a_{ji}$ may be unchanging and highly dependable. The $m_j$ values may change from one time to another. Moreover, the $m_j$ values may be more or less undependable either because of inaccuracies in their measurement or because of other effects on the $m_j$ values than the effects of the $a_{ji}$ and $x_{ti}$ values. In such systems, the methods of error analysis that have been presented may be useful in (1) the design stage and (2) the use stage.

In the design stage, the engineer would want to use the analysis to help him judge the degree of dependability that might be expected from each of several proposed systems. He may know, or be able to estimate, the magnitudes of extraneous effects upon the $m_j$ values. He will want to be able to tell the user of the equipment how dependable the solutions for the unknowns may be under the condition of those anticipated extraneous effects being present.

Equation (48), for example, would be useful in the design stage. That equation yields 
the expected fractional error due to the anticipated extraneous effects, that is, the mag- 
nitude of the E vector. Equation (48), then, would allow an estimate of the accuracy to 
be expected for a given value of n, or could be used with a series of values of n to de- 
termine the necessary value of n to achieve a desired accuracy. The engineer, of 
course, would have the responsibility of judging whether the basic assumption of a uni- 
formly distributed orientation of the error vector E should reasonably be applied to the 
specific system concerned. 
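
A design-stage survey over n of the kind just described might be sketched as follows. The sketch assumes equation (48) in the form $E[|\eta_{ai}|] = E[\beta]\,|E| / (|A_i| \cos\alpha_i)$ and evaluates $E[\beta]$ with the standard gamma-function expression for the mean absolute cosine of a uniformly oriented unit vector, $\Gamma(n/2) / (\sqrt{\pi}\,\Gamma((n+1)/2))$, rather than with the product form of appendix C. All function names and input values are hypothetical, and $\cos\alpha_i$ is held fixed although the text argues that it tends to grow with n.

```python
import math

def expected_beta(n):
    """Mean |cos(theta)| between a fixed and a uniformly oriented unit vector in n dimensions."""
    log_ratio = math.lgamma(n / 2.0) - math.lgamma((n + 1) / 2.0)
    return math.exp(log_ratio) / math.sqrt(math.pi)

def expected_abs_error(n, err_norm, a_norm, cos_alpha):
    """Assumed form of equation (48): E[beta] |E| / (|A_i| cos(alpha_i))."""
    return expected_beta(n) * err_norm / (a_norm * cos_alpha)

def smallest_n(target, err_norm, a_norm, cos_alpha, n_min, n_max=100_000):
    """First n in [n_min, n_max] whose expected error meets the target, else None."""
    for n in range(n_min, n_max + 1):
        if expected_abs_error(n, err_norm, a_norm, cos_alpha) <= target:
            return n
    return None

# Hypothetical design inputs: |E| = 0.5, |A_i| = 3.0, cos(alpha_i) = 0.8.
for n in (10, 30, 100, 300):
    print(n, round(expected_abs_error(n, err_norm=0.5, a_norm=3.0, cos_alpha=0.8), 5))
print("n needed for expected error <= 0.02:",
      smallest_n(0.02, err_norm=0.5, a_norm=3.0, cos_alpha=0.8, n_min=4))
```

Holding $\cos\alpha_i$ fixed should make the estimate somewhat pessimistic for large n if the trend argued in the text holds for the system at hand.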

After a physical system has been designed and constructed and is in use, for example, for monitoring concentrations of toxic gases, the equations that have been developed would allow warnings of several types. Suppose a decision were made that a critical concentration for gas i would be $x_{i(\mathrm{crit})}$. Then equation (16) could be used to sound a warning whenever the value $x_{ci}$ were as great as $x_{i(\mathrm{crit})}$. That warning could be accompanied by an auxiliary signal if, at the same time, an effective standard deviation as great as $x_{ci}$ from equation (16) were indicated. (Such an effective standard deviation would be the product $0.1057\,|A_i'|\,|E_o|$ in eq. (43), or some similar parameter in an equation analogous to eq. (43) for some other system than those to which eq. (43) applies.) In that case, the probability would be about 0.16 (positive values of error only) that the value $x_{ci}$ could be caused entirely by interferents. That is, in about 16 percent of cases the value of $x_{ci}$, equal to $x_{ti}$ plus the error $\eta_{ai}$, could equal or exceed $x_{i(\mathrm{crit})}$ even with $x_{ti} = 0$. With this probability, the user of the warning equipment would, of course, be less disturbed by the warning than if the indicated standard deviation of $\eta_{ai}$ were much smaller.

A most serious and urgent warning might be given at any time when $x_{ci}$ minus the effective standard deviation were in excess of $x_{i(\mathrm{crit})}$. Regardless of the value of $x_{ci}$, a warning might be given that the system could not successfully monitor the concentration of gas i whenever an effective standard deviation greater than $x_{i(\mathrm{crit})}$ was shown.

These possibilities of continuous error monitoring should add greatly to the protec- 
tion provided by a monitoring system. 
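
The warning rules described in this section can be collected into a small decision routine. The following Python sketch is one possible reading of those rules, not a specification from this report: the function name, the signal labels, and the numerical inputs are all hypothetical, and the effective standard deviation is assumed to be supplied by the caller.

```python
import math

def warning_signals(x_calc, sigma_eff, x_crit):
    """Return the list of signals raised for one gas.

    x_calc    : calculated concentration x_ci from the least-squares solution
    sigma_eff : effective standard deviation of the error in x_ci
    x_crit    : critical concentration x_i(crit)
    """
    signals = []
    if sigma_eff > x_crit:
        # The error alone could exceed the critical value.
        signals.append("cannot_monitor")
    if x_calc >= x_crit:
        signals.append("warning")
        if sigma_eff >= x_calc:
            # The reading could be caused entirely by interferents.
            signals.append("may_be_interferents")
    if x_calc - sigma_eff > x_crit:
        signals.append("urgent")
    return signals

def prob_error_at_least_one_sigma():
    """One-sided normal probability that the error is at least one standard deviation."""
    return 0.5 * math.erfc(1.0 / math.sqrt(2.0))

print(warning_signals(x_calc=1.2, sigma_eff=0.1, x_crit=1.0))
print(warning_signals(x_calc=1.0, sigma_eff=1.5, x_crit=1.0))
print(round(prob_error_at_least_one_sigma(), 4))
```

The last function gives the one-sided normal tail beyond one standard deviation, which corresponds to the probability of about 0.16 quoted in the text.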


CONCLUDING REMARKS 

Expressions have been derived for the probability density function of the ratio of 
absolute value of error in determination of an unknown to a uniquely determinable com- 
ponent of the error function that exists within the system of equations. These expres- 
sions are for both known and unknown prior distributions of error. It has been shown 
that the effect of the prior distribution is usually small. The analytical results have 
been well confirmed by Monte Carlo results. 

A method has been presented by which an analysis can be made of the
error to be expected in least-squares solutions of overdetermined systems of linear
simultaneous equations that are expected to be generated by a given physical system. It
has been shown that, other things being equal, including a constant number of unknowns, 
the expected error diminishes with increasing number of simultaneous equations. A 
method has also been shown by which an estimate can be made of the expected value of 
error as a function of the number of simultaneous equations and the magnitudes of error 
sources expressed as an error vector. It has been shown that a normal distribution is 
approximated for the error for the special case of 100 simultaneous equations with 10 
unknowns. 

Lewis Research Center, 

National Aeronautics and Space Administration, 

Cleveland, Ohio, January 24, 1972, 

111-05. 
APPENDIX A


EXAMPLE FOR METHOD OF DETERMINING $A_i^*$ VECTOR

A specific example will be solved for determination of the $A_1^*$ vector, with given $A_{ui}$ vectors (i = 1 to 3), in four-dimensional space. A Gram-Schmidt orthogonalization will first be performed with use of the two $A_{uk}$ vectors, for $k \ne 1$. This orthogonalization could be performed in either of the two possible ways. Then the orthogonalization will be continued to find the third member of the mutually orthogonal Gram-Schmidt set, which will be the $A_1^*$ vector.

The three $A_{ui}$ vectors assumed are

$$\begin{aligned}
A_{u1} &= 0.258\,u_1 + 0.516\,u_2 - 0.258\,u_3 - 0.775\,u_4 \\
A_{u2} &= -0.365\,u_1 + 0.548\,u_2 - 0.183\,u_3 + 0.730\,u_4 \\
A_{u3} &= 0.548\,u_1 + 0.730\,u_2 + 0.365\,u_3 + 0.183\,u_4
\end{aligned} \tag{A1}$$



As a first step we find a vector $A_2^*$. That vector is the component of $A_{u2}$ that is orthogonal to the one-dimensional subspace to which the vector $A_{u3}$ is confined, normalized to unity. That is,

$$A_2^* = \frac{A_{u2} - (A_{u2} \cdot A_{u3})\,A_{u3}}{\left|A_{u2} - (A_{u2} \cdot A_{u3})\,A_{u3}\right|} = -0.531\,u_1 + 0.366\,u_2 - 0.291\,u_3 + 0.707\,u_4 \tag{A2}$$

We next find a vector $A_1^*$. That vector is the component of $A_{u1}$ that is orthogonal to the two-dimensional subspace to which the vectors $A_{u2}$ and $A_{u3}$ are confined, normalized to unity. That is,

$$A_1^* = \frac{A_{u1} - (A_{u1} \cdot A_2^*)\,A_2^* - (A_{u1} \cdot A_{u3})\,A_{u3}}{\left|A_{u1} - (A_{u1} \cdot A_2^*)\,A_2^* - (A_{u1} \cdot A_{u3})\,A_{u3}\right|} = -0.138\,u_1 + 0.538\,u_2 - 0.561\,u_3 - 0.614\,u_4 \tag{A3}$$
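
The two orthogonalization steps of this appendix can be checked numerically. The following NumPy sketch is an illustration only; the helper name `orthonormalize` is hypothetical. The $u_1$ coefficient of $A_{u3}$ is not legible in the available copy of the report, and the value 0.548 used here is an inference from the unit length of $A_{u3}$ and from the printed result of equation (A2).

```python
import numpy as np

def orthonormalize(vectors):
    """Gram-Schmidt: return orthonormal vectors spanning the same nested subspaces."""
    basis = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for b in basis:
            w = w - (w @ b) * b          # remove the component along each earlier member
        basis.append(w / np.linalg.norm(w))
    return basis

A_u1 = np.array([0.258, 0.516, -0.258, -0.775])
A_u2 = np.array([-0.365, 0.548, -0.183, 0.730])
A_u3 = np.array([0.548, 0.730, 0.365, 0.183])   # first coefficient inferred, see lead-in

# Start from A_u3, then A_u2, and finish with A_u1 as in the text.
_, A2_star, A1_star = orthonormalize([A_u3, A_u2, A_u1])

print(np.round(A2_star, 3))   # to be compared with equation (A2)
print(np.round(A1_star, 3))   # to be compared with equation (A3)
```

Because the input coefficients are rounded to three decimals, agreement with the printed results can be expected only to about the third decimal place.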

For p > 3 this procedure would be continued in the same manner. For p = 3, the 
procedure is terminated with



APPENDIX B 


NUMERICAL EXAMPLE OF SOLUTION OF AN OVER DETERMINED SYSTEM OF 
SIMULTANEOUS EQUATIONS BY BOTH VECTOR AND MATRIX 
FORMS OF LEAST-SQUARES METHOD 


A vector solution will be described in detail for a system of six simultaneous equations with three unknowns. Then the matrix solution for the same problem will be shown. The $a_{ji}$ and $m_j$ values were selected by a random procedure, details of which are immaterial at this point.

In general, for the vector solution, a series of values of any vector $V_i$ (i = 1 to k) will be represented by a 6 x k matrix that will be designated [V] and the $u_j$ vectors will be omitted. A series of scalars like $v_i$ (i = 1 to k) will be represented by a column matrix designated as [v].

The $a_{ji}$ values are

$$[A] = \begin{bmatrix}
-0.3809 & 0.8993 & 0.2657 \\
-0.3644 & -0.1383 & -0.3259 \\
-0.5705 & -0.0159 & -0.1912 \\
-0.3020 & -0.6430 & -0.6428 \\
-0.1861 & -0.7559 & 0.2378 \\
-0.8513 & 0.7626 & -0.9064
\end{bmatrix} \tag{B1}$$


The $m_j$ values are

$$[M] = \begin{bmatrix} 1.3737 \\ -0.0505 \\ 0.2161 \\ -0.9771 \\ -0.3574 \\ 0.3180 \end{bmatrix} \tag{B2}$$
The following $A_i'$ vectors were determined with use of equations (11), (12), and (B1), by the method described in appendix A:

$$[A'] = \begin{bmatrix}
-0.6036 & 0.2697 & 0.6276 \\
-0.2753 & -0.1293 & -0.0395 \\
-0.7092 & -0.1587 & 0.3490 \\
0.0025 & -0.3009 & -0.4630 \\
-0.7948 & -0.4671 & 0.6588 \\
-0.1386 & 0.2499 & -0.4775
\end{bmatrix} \tag{B3}$$


By equations (16), (B2), and (B3), the calculated values $x_{ci}$ are

$$[x_c] = \begin{bmatrix} -0.7309 \\ 0.8832 \\ 1.0047 \end{bmatrix} \tag{B4}$$


Solution of the same problem by the matrix method will now be described. Matrices [A] and [M] in equation (2) are, of course, the same as the matrices in equations (B1) and (B2). The matrix [X] in equation (2) is simply

$$[X] = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \tag{B5}$$

The matrix [B] of equation (4) will therefore be

$$[B] = [A]^T [A] = \begin{bmatrix}
1.4539 & -0.5974 & 1.0482 \\
-0.5974 & 2.3944 & -0.1706 \\
1.0482 & -0.1706 & 1.5046
\end{bmatrix} \tag{B6}$$
The inverse of matrix [B] in equation (B6) is

$$[B]^{-1} = \begin{bmatrix}
1.5941 & 0.3212 & -1.0741 \\
0.3212 & 0.4858 & -0.1687 \\
-1.0741 & -0.1687 & 1.3937
\end{bmatrix} \tag{B7}$$

Finally, matrix [C] of equation (6) is

$$[C] = [B]^{-1} [A]^T = \begin{bmatrix}
-0.6036 & -0.2753 & -0.7092 & 0.0025 & -0.7948 & -0.1386 \\
0.2697 & -0.1293 & -0.1587 & -0.3009 & -0.4671 & 0.2499 \\
0.6276 & -0.0395 & 0.3490 & -0.4630 & 0.6588 & -0.4775
\end{bmatrix} \tag{B8}$$


Note that the row vectors in matrix [C] of equation (B8) are the same as the $A_i'$ (column) vectors in equation (B3). Hence, the values $x_{ci}$ determined by the matrix method (eq. (5)) will be the same dot products as by the vector method, and equation (B4) will represent the solution by the matrix method as well as by the vector method.
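
The equivalence of the two forms can be checked numerically with the data of equations (B1) and (B2). The following NumPy sketch is an illustration under the assumption that the matrix method amounts to the usual normal-equations solution $[B]^{-1}[A]^T[M]$. The variable names are arbitrary, and `numpy.linalg.lstsq` is used only as an independent cross-check.

```python
import numpy as np

A = np.array([
    [-0.3809,  0.8993,  0.2657],
    [-0.3644, -0.1383, -0.3259],
    [-0.5705, -0.0159, -0.1912],
    [-0.3020, -0.6430, -0.6428],
    [-0.1861, -0.7559,  0.2378],
    [-0.8513,  0.7626, -0.9064],
])                                   # equation (B1)
M = np.array([1.3737, -0.0505, 0.2161, -0.9771, -0.3574, 0.3180])  # equation (B2)

B = A.T @ A                          # equation (B6)
C = np.linalg.inv(B) @ A.T           # equation (B8); row i of C is the vector A'_i
x_matrix = C @ M                     # matrix form of the solution

# Vector form: each unknown is the dot product of A'_i with M.
x_vector = np.array([C[i] @ M for i in range(A.shape[1])])

# Independent check with a library least-squares solver.
x_lstsq, *_ = np.linalg.lstsq(A, M, rcond=None)

print(np.round(B, 4))
print(np.round(x_matrix, 4))         # to be compared with equation (B4)
```

The rows of `C` satisfy `C @ A = I`, which is the property that makes each $A_i'$ pick out its own unknown from [M].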
APPENDIX C


STATISTICAL DISTRIBUTION OF ABSOLUTE VALUES OF DOT PRODUCTS 
OF RANDOMLY ORIENTED VECTORS 

The expected absolute value of a dot product, as well as the statistical distribution, will be sought for two unit vectors $A_u$ and $R$ within n-dimensional space. The results sought will be for a special case of a more general treatment by Lord (ref. 5). The unit vector $A_u$ will be treated as fixed, though the result would be the same if it were randomly oriented with any distribution of orientation. The unit vector $R$ will be assumed to have a uniformly distributed orientation.

The method of finding the expected absolute value of the dot product, $\beta$ according to equation (36), will be discussed with reference to sketch (a), which, of course, applies only to three-dimensional space. The method, however, will be extended to n-dimensional space. Integrations will be applied to the surface of a hemisphere that is centered about an axis in the direction of $u_1$. The specific unit vector $A_u$ will be taken as the vector $u_1$.

The random unit vector $R$ is shown, extending in one of the infinite number of possible directions. The magnitudes of its components in the directions of $u_1$, $u_2$, and $u_3$ are $X_1$, $X_2$, and $X_3$. A surface element dS is shown, which contains the terminus of $R$. The basic assumption of uniformly distributed orientation of $R$ may now be more explicitly stated as an assumption that all surface elements of equal area on the hemisphere will have equal probabilities of enclosing the terminus of $R$.
Now,

$$\beta = X_1 \tag{C1}$$

so the expected value of $\beta$ is

$$E[\beta] = \frac{\int_S X_1\,dS}{\int_S dS} \tag{C2}$$

where $\int_S$ indicates integration over the entire surface of the hemisphere.

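A standard formula for determining the area of the $X_1$-centered hemispherical surface of unit radius in rectangular coordinates, in three dimensions, is

$$\int_S dS = \int_{-1}^{1} \int_{-\sqrt{1 - X_2^2}}^{\sqrt{1 - X_2^2}} \frac{dX_3\,dX_2}{\sqrt{1 - X_2^2 - X_3^2}} \tag{C3}$$

in which the denominator of the integrand is equal to the numerical value of $X_1$. So, from equations (C2) and (C3),

$$E[\beta] = \frac{\displaystyle\int_{-1}^{1} \int_{-\sqrt{1 - X_2^2}}^{\sqrt{1 - X_2^2}} dX_3\,dX_2}{\displaystyle\int_{-1}^{1} \int_{-\sqrt{1 - X_2^2}}^{\sqrt{1 - X_2^2}} \frac{dX_3\,dX_2}{\sqrt{1 - X_2^2 - X_3^2}}} \tag{C4}$$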
Extension of this formula to n dimensions gives



where 



/ 44 o 

/i- 1 \ 


«t n 

A-xx 

X Xj = 0 for ‘=2 

(C6) 

r 

j=2 

Vi =2 / 



If integrations in numerator and denominator are performed concurrently, the over- 
all task of integration becomes quite simple. Constants can be cancelled at every step 
of the process after the first. Thus, 
and so on until we have


E 



£ n_ 2 jy 

h dX 2 


(C9) 


(CIO) 


(Cll) 


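or

$$E[\beta] = \left(\frac{2}{\pi}\right)^{s} \frac{1}{n-1} \prod_{k=1}^{n-2} k^{(-1)^{s+k+1}} \qquad \begin{pmatrix} s = 1 \text{ for } n \text{ even} \\ s = 0 \text{ for } n \text{ odd} \end{pmatrix} \tag{C12}$$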


Now,

$$E[\beta] = \int_0^1 \beta f(\beta)\,d\beta = \int_0^{\pi/2} f(\beta) \cos\theta \sin\theta\,d\theta \tag{C13}$$

where $f(\beta)$ is the probability density function of $\beta$.

If any of the expressions for $E[\beta]$ in equation (C12) contained $X_1$, that is, $\cos\theta$, that expression could be equated to the right side of equation (C13) and the result could be solved for $f(\beta)$. Unfortunately, such is not the case. But an expression may be found for $f(\beta)$ as follows.
Equation (C5) can be regarded as applying to the entire hypersphere if both numerator and denominator are considered to be multiplied by 2. Then, for the expected value of $\beta$, an integration relative to any $X_j$ is as valid as an integration relative to $X_1$. Hence, instead of multiplying the innermost integrand of the numerator by $X_1$, we could have multiplied the integrand by $X_2$ in the position just after the first integral sign in the numerator of equation (C5). Then the integration shown in equation (C7) would have been omitted. Equations (C8) to (C10) would have had identical numerators and denominators. We would have arrived at


$$E[\beta] = \frac{\int_0^1 X_2 \left(1 - X_2^2\right)^{(n-3)/2} dX_2}{\int_0^1 \left(1 - X_2^2\right)^{(n-3)/2} dX_2} \tag{C14}$$


or

$$E[\beta] = E[\cos\theta] = \frac{\int_0^{\pi/2} \cos\theta\,\sin^{n-2}\theta\,d\theta}{\int_0^{\pi/2} \sin^{n-2}\theta\,d\theta} \tag{C15}$$

in which it is justified to use $\theta = \cos^{-1} X_2$ as if it were $\theta = \cos^{-1} X_1$. By comparison of equations (C12), (C13), and (C15), it may be seen that


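$$f(\beta) = \left(\frac{2}{\pi}\right)^{s} \left[\prod_{k=1}^{n-2} k^{(-1)^{s+k+1}}\right] \left(1 - \beta^2\right)^{(n-3)/2} \qquad \begin{pmatrix} s = 1 \text{ for } n \text{ even} \\ s = 0 \text{ for } n \text{ odd} \end{pmatrix} \tag{C16}$$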


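And the distribution function is

$$F(\beta) = \left(\frac{2}{\pi}\right)^{s} \left[\prod_{k=1}^{n-2} k^{(-1)^{s+k+1}}\right] \int_{\theta}^{\pi/2} \sin^{n-2}\varphi\,d\varphi \qquad \begin{pmatrix} s = 1 \text{ for } n \text{ even} \\ s = 0 \text{ for } n \text{ odd} \\ \theta = \cos^{-1}\beta \end{pmatrix} \tag{C17}$$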


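or

$$F(\beta) = s + \left(\frac{2}{\pi}\right)^{s} \left\{ \beta \sin^{s}\theta \left[ 1 + \sum_{r=1}^{(n-s-3)/2} \sin^{2r}\theta \prod_{k=1}^{2r+s} k^{(-1)^{s+k+1}} \right] - s\theta \right\} \qquad \begin{pmatrix} s = 1 \text{ for } n \text{ even} \\ s = 0 \text{ for } n \text{ odd} \\ \theta = \cos^{-1}\beta \end{pmatrix} \tag{C18}$$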


We may disregard any question regarding the $\Pi$ factors or summations in equations (C12), (C16), and (C18) at certain very low values of n because those values of n are all below the values of interest in this report.

For n taken arbitrarily as 100, the solid curve in figure 3 represents equation (C16). The plotted points in the same figure are calculated values obtained with the following probability density function:

$$f(\beta) = 7.876 \exp\left[-\frac{1}{2}\left(\frac{\beta}{0.1013}\right)^2\right] \tag{C19}$$

The agreement seems to be sufficiently precise that, for n = 100, equation (C19) may be used for practical purposes in lieu of equation (C16). Note that the standard deviation 0.1013 in equation (C19) is approximately $(n - 3)^{-1/2}$.
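
The closeness of the normal approximation for n = 100 can be examined with a short calculation. The following Python sketch is an illustration only. It assumes that equation (C16) is the density proportional to $(1 - \beta^2)^{(n-3)/2}$ on $0 \le \beta \le 1$, and it normalizes that density with the standard gamma-function constant $2\,\Gamma(n/2) / (\sqrt{\pi}\,\Gamma((n-1)/2))$ instead of the product form printed in this appendix. The function names are hypothetical.

```python
import math

def density_exact(beta, n):
    """Density of |cos(theta)| for a uniformly oriented unit vector in n dimensions."""
    log_norm = (math.log(2.0) + math.lgamma(n / 2.0)
                - math.lgamma((n - 1) / 2.0) - 0.5 * math.log(math.pi))
    return math.exp(log_norm) * (1.0 - beta * beta) ** ((n - 3) / 2.0)

def density_normal_approx(beta):
    """Equation (C19), stated in the report for n = 100 only."""
    return 7.876 * math.exp(-0.5 * (beta / 0.1013) ** 2)

n = 100
grid = [0.01 * k for k in range(41)]          # beta from 0.00 to 0.40
max_abs_diff = max(abs(density_exact(b, n) - density_normal_approx(b)) for b in grid)

for b in (0.0, 0.05, 0.10, 0.20, 0.30):
    print(f"beta={b:.2f}  exact={density_exact(b, n):.4f}  normal={density_normal_approx(b):.4f}")
print(f"largest absolute difference on the grid: {max_abs_diff:.4f}")
```

The comparison is made in absolute terms because the relative difference grows in the far tail, where both densities are already very small.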
APPENDIX D


EFFECT OF PRIOR DISTRIBUTION OF MAGNITUDE OF ERROR VECTOR 

Equations (38) and (42) were derived for the case where it was assumed that nothing was known about the error vector or its components except its uniform distribution in space. Now we wish to find a density function for r for the situation where $|E_o|$ is known exactly. We will show that a prior density function of the magnitude of the component of the error vector $|E_{o+i}|$ must be taken into account. By prior density function we mean a density function that is inherent in the method of generation of the system of equations (1). This prior density function would depend on the source of equations (1), and might differ for any two different physical systems that could give rise to the same system of equations.

We shall develop here equations analogous to (38) and (42) that will include consideration of the prior density function $f(|E_{o+i}|)$. That is, we wish to derive the probability density function of r given $|E_o|$, where r is the ratio $|E_i|/|E_o|$, $|E_i|$ is the magnitude of the component of the error vector $E$ lying in an arbitrary direction (labeled i) within the subspace $S_p$, and $|E_o|$ is the known magnitude of the component $E_o$ of the error vector lying in subspace $S_o$. Again the basic assumption that is to be employed here is the uniform distribution over space of the direction of the error vector $E$, in particular, that portion $E_{o+i}$ of $E$ lying in the (n - p + 1)-dimensional space consisting of the addition of the i direction within the space $S_p$ to the totality of the space $S_o$. This (n - p + 1)-dimensional subspace is denoted $S_{o+i}$.

By the uniform distribution hypothesis, the direction of $E_{o+i}$ is uniformly distributed over $S_{o+i}$ so that the ratio $|E_i|/|E_{o+i}|$ can be identified with $\beta$ and the probability density function of this ratio is identical with that given by equation (C16) where n takes on the value n - p + 1. This condition of uniform distribution of direction of $E_{o+i}$ was used to derive equation (38) for the density function g(r).

We assume that the prior distribution represented by the density function $f(|E_{o+i}|)$ is independent of the direction of $E_{o+i}$, which implies that it is independent of r. Hence,

$$f(r \mid |E_{o+i}|) = g(r) \tag{D1}$$

where g(r) is as given by equation (38).

The quantity that we wish to determine is $f(r \mid |E_o|)$ where $|E_o|$ (and not $|E_{o+i}|$) is known. Using equation (37) once again, with equation (35), we can write
To arrive at an expression for $f(|E_{o+i}| \mid |E_o|)$, which appears in equation (D2), we apply the rule of Bayes, namely,
We now need an expression for $f(|E_o| \mid |E_{o+i}|)$, which appears in equation (D3). As that equation will be used for substitution into equation (D2), the expression for $f(|E_o| \mid |E_{o+i}|)$ must be for given r, namely, the specific value of r in the left side of equation (D2). So, with $|E_{o+i}|$ fixed and with use of equations (37) and (35) under the assumption that $|E_{o+i}|$ and r are independently distributed, we get

$$f(|E_o| \mid |E_{o+i}|) = g(r) \left|\frac{dr}{d|E_o|}\right| = \frac{1 + r^2}{r\,|E_o|}\,g(r) \tag{D4}$$

where g(r) is as given by equation (38).

Now, substituting $f(|E_o| \mid |E_{o+i}|)$ from equation (D4) into equation (D3) and then substituting $f(|E_{o+i}| \mid |E_o|)$ from equation (D3) into equation (D2), we get
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E, 
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: ( r H E oi) = 


- . f) 

1 + ^ g(r) 

- V 0+17 
,2 

'|S 0 I 


l E o + lHE 0 |Vl+r 2 



(D5) 


'(iVil) 




7ITT g<r) d|E o«! 


or, with use of equation (35), 



If we now substitute from equation (38) and simplify, 
Equation (D7) may be replaced by equation (38) even for use with given $|E_o|$ if we have reason to assume that $f(r \mid |E_o|)$ is only negligibly sensitive to $f(|E_{o+i}|)$.

With the use of equation (D7), an exact equation corresponding to the approximate equation (42) may be obtained in the same way as before. The result is




(D8) 
APPENDIX E


STATISTICAL RESULT OF ERROR DETERMINATIONS IN SOLUTIONS OF 
10 000 SYSTEMS OF SIMULTANEOUS EQUATIONS 

As a Monte Carlo verification of equation (38), least-squares solutions were executed for 10 000 systems of simultaneous equations. Each system included six equations and three unknowns. Each of the 10 000 systems involved use of 27 outputs from a random number generator, 270 000 random numbers in all. Each random number was within the range from minus one to plus one, with rectangular distribution within that range.

Least-squares solutions in both vector and matrix forms were shown for an overdetermined system in appendix B. That system was one of the 10 000 used in the Monte Carlo analysis now to be described. As an example, discussion of that specific problem will now be continued.

For this example, 18 of the 27 outputs from the random number generator were used for the $A_i$ vectors shown in equation (B1) of appendix B, which is as follows:

$$[A] = \begin{bmatrix}
-0.3809 & 0.8993 & 0.2657 \\
-0.3644 & -0.1383 & -0.3259 \\
-0.5705 & -0.0159 & -0.1912 \\
-0.3020 & -0.6430 & -0.6428 \\
-0.1861 & -0.7559 & 0.2378 \\
-0.8513 & 0.7626 & -0.9064
\end{bmatrix} \tag{B1}$$

Three of the 27 outputs were used as

$$[x_t] = \begin{bmatrix} -0.7850 \\ 0.9955 \\ 0.8661 \end{bmatrix} \tag{E1}$$
The remaining six from the 27 outputs were used as

$$[E] = \begin{bmatrix} -0.0506 \\ 0.0834 \\ -0.0504 \\ -0.0173 \\ 0.0430 \\ -0.3244 \end{bmatrix} \tag{E2}$$


With use of the 27 random values shown in equations (B1), (E1), and (E2), the vector M was determined by using equation (17), with the result shown by equation (B2) of appendix B, which is as follows:

$$[M] = \begin{bmatrix} 1.3737 \\ -0.0505 \\ 0.2161 \\ -0.9771 \\ -0.3574 \\ 0.3180 \end{bmatrix} \tag{B2}$$


From this point on, the values in equation (E2) were not used in any way. Thus, the true values $x_{ti}$ from which the problem arose were known, but could not be recovered from the $A_i$ and M vectors because the same M vector could have resulted from any one of an infinity of possible combinations of $x_{ti}$ values. So the true errors relative to the combination of $x_{ti}$ values that actually gave rise to the problem could be determined and compared with the calculated error distribution.

The Gram-Schmidt orthonormal set, spanning the $S_p$ subspace, was determined with use of the $A_i$ vectors of equation (B1) as follows:

$$\begin{bmatrix}
0.13699 & 0.58117 & 0.53166 \\
0.34924 & -0.08939 & -0.03348 \\
0.50292 & -0.01025 & 0.29560 \\
0.40480 & -0.41554 & -0.39221 \\
0.32800 & -0.48848 & 0.55803 \\
0.57870 & 0.49281 & -0.40448
\end{bmatrix} \tag{E3}$$


Continuation of the Gram-Schmidt orthogonalization, with use of $U_i = u_i$ (i = 1 to 3), gives three vectors (i = 1 to 3) spanning the subspace $S_o$ as follows:

$$\begin{bmatrix}
0.60067 & 0.00000 & 0.00000 \\
0.03648 & 0.93144 & 0.00000 \\
-0.36642 & -0.16457 & 0.70586 \\
0.65687 & -0.23148 & 0.15683 \\
-0.09609 & -0.14604 & -0.55841 \\
-0.25078 & -0.17441 & -0.40662
\end{bmatrix} \tag{E4}$$


Either with use of equations (28), (B2), and (E4), or with use of equations (29), (B2), and (E3), the vector component $E_o$ was determined as

$$[E_o] = \begin{bmatrix} 0.03421 \\ 0.13275 \\ 0.00517 \\ 0.01585 \\ -0.06482 \\ -0.06705 \end{bmatrix} \tag{E5}$$

From equation (E5),

$$|E_o| = 0.16664 \tag{E6}$$
The $\eta_{ai}$ values, now, are simply the values $x_{ci}$ of equation (B4) from appendix B, reproduced here,

$$[x_c] = \begin{bmatrix} -0.7309 \\ 0.8832 \\ 1.0047 \end{bmatrix} \tag{B4}$$

minus the values $x_{ti}$ of equation (E1). The $\eta_{ai}$ values are

$$[\eta_a] = \begin{bmatrix} 0.05409 \\ -0.11236 \\ 0.13860 \end{bmatrix} \tag{E7}$$


The $A_i'$ vectors were given in equation (B3) of appendix B, which is reproduced here:

$$[A'] = \begin{bmatrix}
-0.6036 & 0.2697 & 0.6276 \\
-0.2753 & -0.1293 & -0.0395 \\
-0.7092 & -0.1587 & 0.3490 \\
0.0025 & -0.3009 & -0.4630 \\
-0.7948 & -0.4671 & 0.6588 \\
-0.1386 & 0.2499 & -0.4775
\end{bmatrix} \tag{B3}$$

From equation (B3),

$$\begin{aligned}
|A_1'| &= 1.26255 \\
|A_2'| &= 0.69698 \\
|A_3'| &= 1.18054
\end{aligned} \tag{E8}$$
Finally, from equations (E6), (E7), and (E8),

$$\begin{aligned}
\frac{\eta_{a1}}{|A_1'|\,|E_o|} &= 0.25710 \\
\frac{\eta_{a2}}{|A_2'|\,|E_o|} &= -0.96743 \\
\frac{\eta_{a3}}{|A_3'|\,|E_o|} &= 0.70456
\end{aligned} \tag{E9}$$


Absolute values of the results shown in equations (E9) are equivalent to Monte Carlo 
trial values of r as used in equation (38). Accordingly, from the 10 000 random prob- 
lems solved, 30 000 Monte Carlo trial values were obtained for r. 
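
The chain of results for this one system can be retraced numerically from equations (B1), (E1), and (E2). The following NumPy sketch is an illustration under two assumptions: that equation (17) is $M = \sum_i x_{ti} A_i + E$, and that $E_o$ is the least-squares residual $M - [A][x_c]$. The variable names are arbitrary.

```python
import numpy as np

A = np.array([
    [-0.3809,  0.8993,  0.2657],
    [-0.3644, -0.1383, -0.3259],
    [-0.5705, -0.0159, -0.1912],
    [-0.3020, -0.6430, -0.6428],
    [-0.1861, -0.7559,  0.2378],
    [-0.8513,  0.7626, -0.9064],
])                                                  # equation (B1)
x_t = np.array([-0.7850, 0.9955, 0.8661])           # equation (E1)
E = np.array([-0.0506, 0.0834, -0.0504, -0.0173, 0.0430, -0.3244])  # equation (E2)

M = A @ x_t + E                       # assumed form of equation (17)
A_dual = A @ np.linalg.inv(A.T @ A)   # columns are the A'_i vectors
x_c = A_dual.T @ M                    # calculated unknowns
eta_a = x_c - x_t                     # algebraic errors
E_o = M - A @ x_c                     # residual, taken here as the component of E in S_o
norm_E_o = np.linalg.norm(E_o)
r = eta_a / (np.linalg.norm(A_dual, axis=0) * norm_E_o)

print(np.round(M, 4))        # to be compared with equation (B2)
print(round(norm_E_o, 5))    # to be compared with equation (E6)
print(np.round(eta_a, 5))    # to be compared with equation (E7)
print(np.round(r, 5))        # to be compared with equation (E9)
```

Since the printed inputs are rounded to four decimals, agreement with the five-decimal results of the report can be expected only to about the third or fourth decimal place.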

The 30 000 trial results were counted by class, with class marks at 0.1, 0.3, 0.5, and so on. The normalized counts, or discrete values of approximate probability density f(r), are plotted as circular symbols in figure 4. The solid curve in the same figure represents equation (38) for n - p + 1 = 4. The abscissa values and corresponding probability densities (ordinates) for the absolute values of the results in equation (E9) are marked by the three vertical lines in the figure. They contributed to the counts at class marks 0.3, 0.7, and 0.9.
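
A Monte Carlo experiment of the kind described in this appendix might be sketched as follows. This is not the program used for the report. The function names are hypothetical, the random number generator is NumPy's default, and the reference density $g(r) \propto (1 + r^2)^{-m/2}$ with m = n - p + 1 is the form implied by a uniformly oriented error vector; it is assumed here, not confirmed, to coincide with equation (38). The optional flag `isotropic_error` replaces the rectangular error components by normal ones, which makes the orientation of the error vector exactly uniform.

```python
import math
import numpy as np

def monte_carlo_r(n_systems=10_000, n=6, p=3, seed=0, isotropic_error=False):
    """Return n_systems * p trial values of r = |eta_ai| / (|A'_i| |E_o|)."""
    rng = np.random.default_rng(seed)
    out = np.empty((n_systems, p))
    for s in range(n_systems):
        A = rng.uniform(-1.0, 1.0, (n, p))
        x_t = rng.uniform(-1.0, 1.0, p)
        E = rng.standard_normal(n) if isotropic_error else rng.uniform(-1.0, 1.0, n)
        M = A @ x_t + E
        A_dual = A @ np.linalg.inv(A.T @ A)      # columns are the A'_i vectors
        x_c = A_dual.T @ M
        E_o = M - A @ x_c                        # least-squares residual
        out[s] = np.abs(x_c - x_t) / (np.linalg.norm(A_dual, axis=0) * np.linalg.norm(E_o))
    return out.ravel()

def density_r(r, m):
    """Assumed reference density, normalized on r >= 0, for m = n - p + 1."""
    log_c = (math.log(2.0) + math.lgamma(m / 2.0)
             - math.lgamma((m - 1) / 2.0) - 0.5 * math.log(math.pi))
    return math.exp(log_c) * (1.0 + r * r) ** (-m / 2.0)

def cdf_r_m4(r):
    """Distribution function of the assumed density for m = 4."""
    return (2.0 / math.pi) * (math.atan(r) + r / (1.0 + r * r))

r = monte_carlo_r()                               # rectangular errors, as in the report
edges = np.linspace(0.0, 3.0, 16)                 # class width 0.2, class marks 0.1, 0.3, ...
counts, _ = np.histogram(r, bins=edges)
approx_density = counts / (r.size * 0.2)
for mark, d in zip(edges[:-1] + 0.1, approx_density):
    print(f"{mark:.1f}  counted={d:.3f}  reference={density_r(mark, 4):.3f}")
```

The ratio r does not depend on the overall scale of the error vector, so the range chosen for the error components should not affect the comparison.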



Figure 4. - Probability densities for r for n = 6 and p = 3.